Generalizing prosodic prediction of speech recognition errors
Authors
Abstract
Because users of spoken dialogue systems have difficulty correcting system misconceptions, it is important for automatic speech recognition (ASR) systems to know when their best hypothesis is incorrect. We compare the results of previous experiments, which showed that prosody improves the detection of ASR errors, with experiments on a new system in a new domain, the W99 conference registration system. Our new results again show that prosodic features can improve the prediction of ASR misrecognitions over other standard techniques for ASR rejection.
Similar resources
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Predicting User Reactions to System Error
This paper focuses on the analysis and prediction of so-called aware sites, defined as turns where a user of a spoken dialogue system first becomes aware that the system has made a speech recognition error. We describe statistical comparisons of features of these aware sites in a train timetable spoken dialogue corpus, which reveal significant prosodic differences between such turns, compared w...
Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction
In the state-of-the-art speech synthesis system, prosodic phrase prediction is the most serious problem which leads to about 40% of text analysis errors. Two optimization strategies are proposed in this paper to deal with two major types of prosodic phrase prediction errors. First, unsupervised adaptation method is proposed to alleviate the mismatching problem between training and testing. Seco...
Prediction of American listeners’ misrecognition of English words spoken by Japanese
This study tries to automatically estimate the probability of individual spoken words of Japanese English (JE) being perceived correctly by American listeners and to clarify what kind of combinations of segmental, prosodic, and/or linguistic errors are more fatal to the correct recognition. Firstly, from a large speech database of JE, a balanced set of 360 utterances of 90 male speakers were se...
Study on Detection of Prosodic Phrase Boundaries in Spontaneous Speech
Prosodic information, which has the abilities of disambiguation, improving the parsing of the spoken language and predicting recognition errors, becomes more and more important in speech recognition and understanding, especially in spontaneous speech. In this paper, we investigate the detection of the phrase boundaries by prosodic features in the domain-specified Chinese spontaneous speech. The...